AI Model Inference (stable:2025-04-01)

2025/04/08 • 4 new methods

GetChatCompletions (new)
Description Gets chat completions for the provided chat messages. Completions support a wide variety of tasks and generate text that continues from or "completes" provided prompt data. The method makes a REST API call to the `/chat/completions` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /chat/completions
{
api-version: string ,
extra-parameters: string ,
body:
{
messages:
[
{
role: enum ,
}
,
]
,
frequency_penalty: number ,
stream: boolean ,
presence_penalty: number ,
temperature: number ,
top_p: number ,
max_tokens: integer ,
response_format:
{
type: string ,
}
,
stop:
[
string ,
]
,
tools:
[
{
type: enum ,
function:
{
name: string ,
description: string ,
parameters: object ,
}
,
}
,
]
,
tool_choice: string ,
seed: integer ,
model: string ,
modalities:
[
string ,
]
,
}
,
}

⚐ Response (200)

{
id: string ,
object: enum ,
created: integer ,
model: string ,
choices:
[
{
index: integer ,
finish_reason: enum ,
message:
{
role: enum ,
content: string ,
reasoning_content: string ,
tool_calls:
[
{
id: string ,
type: enum ,
function:
{
name: string ,
arguments: string ,
}
,
}
,
]
,
audio:
{
id: string ,
expires_at: integer ,
data: string ,
format: enum ,
transcript: string ,
}
,
}
,
}
,
]
,
usage:
{
completion_tokens: integer ,
prompt_tokens: integer ,
total_tokens: integer ,
completion_tokens_details:
{
audio_tokens: integer ,
reasoning_tokens: integer ,
total_tokens: integer ,
}
,
prompt_tokens_details:
{
audio_tokens: integer ,
cached_tokens: integer ,
}
,
}
,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetEmbeddings (new)
Description Return the embedding vectors for given text prompts. The method makes a REST API call to the `/embeddings` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /embeddings
{
api-version: string ,
extra-parameters: string ,
body:
{
input:
[
string ,
]
,
dimensions: integer ,
encoding_format: enum ,
input_type: enum ,
model: string ,
}
,
}

⚐ Response (200)

{
id: string ,
data:
[
{
embedding:
[
number ,
]
,
index: integer ,
object: enum ,
}
,
]
,
usage:
{
prompt_tokens: integer ,
total_tokens: integer ,
}
,
object: enum ,
model: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetImageEmbeddings (new)
Description Return the embedding vectors for given images. The method makes a REST API call to the `/images/embeddings` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /images/embeddings
{
api-version: string ,
extra-parameters: string ,
body:
{
input:
[
{
image: string ,
text: string ,
}
,
]
,
dimensions: integer ,
encoding_format: enum ,
input_type: enum ,
model: string ,
}
,
}

⚐ Response (200)

{
id: string ,
data:
[
{
embedding:
[
number ,
]
,
index: integer ,
object: enum ,
}
,
]
,
usage:
{
prompt_tokens: integer ,
total_tokens: integer ,
}
,
object: enum ,
model: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetModelInfo (new)
Description Returns information about the AI model deployed. The method makes a REST API call to the `/info` route on the given endpoint. This method will only work when using Serverless API, Managed Compute, or Model . inference endpoint. Azure OpenAI endpoints don't support i.
Reference Link ¶

⚼ Request

GET:  /info
{
api-version: string ,
model: string ,
}

⚐ Response (200)

{
model_name: string ,
model_type: enum ,
model_provider_name: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}